Systems and methods for robotic control under contact

ABSTRACT

In variants, a method for robot control can include: receiving sensor data of a scene, modeling the physical objects within the scene, determining a set of potential grasp configurations for grasping a physical object within the scene, determining a reach behavior based on a potential grasp configuration, determining a trajectory for the reach behavior, and grasping the object using the trajectory.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. Pat. Application No. 16/943,884, filed 30-JUL-2020, which claims priority under 35 U.S.C. 119(e) to U.S. Provisional Pat. Application No. 62/882,395, filed 02-AUG-2019, U.S. Provisional Pat. Application No. 62/882,396, filed 02-AUG-2019, and U.S. Provisional Pat. Application No. 62/882,397, filed 02-AUG-2019, the disclosures of which are hereby incorporated herein by reference in their entirety as if set forth in full.

This application is also related to U.S. Pat. Application Serial No. 16/943,915, filed on 30-JUL-2020, entitled ROBOTIC MANIPULATORS, and is also related to U.S. Pat. Application Serial No. 16/944,020, filed on 30-JUL-2020, entitled ROBOTIC SYSTEM FOR PICKING AND PLACING OBJECTS FROM AND INTO A CONSTRAINED SPACE, all of which are incorporated herein by reference in their entirety as if set forth in full.

BACKGROUND

1. Technical Field

The embodiments described herein are related to robotic control systems, and more specifically to a robotic software system for accurate control of robots that physically interact with various objects in their environments, while simultaneously incorporating the force feedback from these physical interactions into a “control policy”.

2. Related Art

It is currently very hard to build automated machines for manipulating objects of various shapes, sizes, inertias, and materials. Within factories, robots perform many kinds of manipulation on a daily basis. They lift massive objects, move with blurring speed, and repeat complex performances with unerring precision. Yet outside of these carefully controlled robot realms, even the most sophisticated robot would be unable to perform many tasks that involve contact with other objects. Everyday manipulation tasks would stump conventionally controlled robots. As such, outside of controlled environments, robots have only performed sophisticated manipulation tasks when operated by a human.

Within simulation, robots have performed sophisticated manipulation tasks such as grasping multifaceted objects, tying knots, carrying objects around complex obstacles, and extracting objects from piles of entangled objects. The control algorithms for these demonstrations often employ search algorithms to find satisfactory solutions, such as a path to a goal state, or a configuration of a gripper that maximizes a measure of grasp quality against an object.

For example, many virtual robots use algorithms for motion planning that rapidly search for paths through a state space that describes the kinematics and dynamics of the world. Almost all of these simulations ignore the robot’s sensory systems and assume that the state of the world is known with certainty. As examples, they might be provided with greater accuracy of the objects’ states, e.g., positions and velocities, than is obtainable using state-of-the-art sensors; they might be provided with states for objects that, due to occlusions, are not visible to sensors; or both.

In a carefully controlled environment, these assumptions can be met. For example, within a traditional factory setting, engineers can ensure that a robot knows the state of relevant objects in the world to accuracy sufficient to perform necessary tasks. The robot typically needs to perform a few tasks using a few known objects, and people are usually banned from the area while the robot is working. Mechanical feeders can enforce constraints on the pose of the objects to be manipulated. And in the event that a robot needs to sense the world, engineers can make the environment favorable to sensing by controlling factors such as the lighting and the placement of objects relative to the sensor. Moreover, since the objects and tasks are known in advance, perception can be specialized to the environment and task. Whether by automated planning or direct programming, robots perform exceptionally well in factories or other controlled environments. Within research labs, successful demonstrations of robots autonomously performing complicated manipulation tasks have relied on some combination of known objects, easily identified and tracked objects (e.g., a bright red ball), uncluttered environments, fiducial markers, or narrowly defined, task-specific controllers.

Outside of controlled settings, however, robots have only performed sophisticated manipulation tasks when operated by a human. Through teleoperation, even highly complex humanoid robots have performed a variety of challenging everyday manipulation tasks, such as grasping everyday objects, using a power drill, throwing away trash, and retrieving a drink from a refrigerator.

But accurate control of robots that autonomously, physically interact with various objects in their environments has proved elusive.

SUMMARY

Systems and methods for controlling machines to accurately manipulate objects that can be effectively modeled by rigid body kinematics and dynamics are described herein.

A system comprises a database; at least one hardware processor coupled with the database; and one or more software modules that, when executed by the at least one hardware processor: receive at least one of sensory data from a robot and images from a camera; identify and build models of objects in an environment, wherein each model encompasses immutable properties of the identified object, including mass and geometry, and wherein the geometry is assumed not to change; estimate the state, including position, orientation, and velocity, of the identified objects; determine, based on the states and models, potential configurations, or pre-grasp poses, for grasping the identified objects, returning multiple grasping configurations per identified object; determine an object to be picked based on a quality metric; translate the pre-grasp poses into behaviors that define motor forces and torques; and communicate the motor forces and torques to the robot in order to allow the robot to perform a complex behavior generated from the behaviors.

These and other features, aspects, and embodiments are described below in the section entitled “Detailed Description.”

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating an example environment in which a robot can be controlled in accordance with one embodiment;

FIG. 2 is a diagram illustrating an example robot that can be used in the environment of FIG. 1 and controlled in accordance with one embodiment; and

FIG. 3 is a diagram illustrating the software modules for controlling a robot within an environment such as depicted in FIG. 1, in accordance with one example embodiment.

DETAILED DESCRIPTION

Systems, methods, and non-transitory computer-readable media are disclosed for robot control. The disclosure and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments and examples that are described and/or illustrated in the accompanying drawings and detailed in the following description. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale, and features of one embodiment can be employed with other embodiments, even if not explicitly stated herein. Descriptions of well-known components and processing techniques may be omitted so as to not unnecessarily obscure the embodiments of the disclosure. The examples used herein are intended merely to facilitate an understanding of ways in which the disclosure can be practiced and to further enable those of skill in the art to practice the embodiments of the disclosure. Accordingly, the examples and embodiments herein should not be construed as limiting the scope of the disclosure. Moreover, it is noted that like reference numerals represent similar parts throughout the several views of the drawings.

FIG. 3 illustrates a process for controlling machines to accurately manipulate objects that can be modeled by rigid body kinematics and dynamics, e.g., bottles, silverware, chairs, etc., and the software modules configured to implement the process in accordance with one embodiment. Rigid body dynamics can be used to accurately model even non-stiff objects, like a rubber ball, due to the accuracy obtainable by a robot’s sensors during real-time operation. With such software, robots can be made to, e.g., move furniture, pick up a pen, or use a wrench to tighten a bolt.

The highest-level construct of the software system can be termed a behavior. A behavior consists of a control policy that maps estimates of the state of the world to motor signals and, optionally, a deliberative component. For example, controlling a robot to reach toward an object, a “reach” behavior, in order to grasp it, requires planning a path through space such that the robot will not inadvertently collide with itself or its environment. After the plan has been computed, it can be input to an impedance controller, i.e., the control policy, sequentially until the planned motion is complete, which means either that the plan has been successfully executed or that execution failed. Various other kinds of control schemes will work as well, like admittance control, operational space control, etc.
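
For concreteness, the control-policy half of such a behavior can be sketched in a few lines of Python. The following is a minimal joint-space impedance law under stated assumptions, not the controller of any particular embodiment; the robot interface (read_state, gravity_torque, apply_torques, wait) is a hypothetical API used only for illustration:

    import numpy as np

    def impedance_policy(q, qd, q_des, qd_des, K, D, tau_gravity):
        # Joint-space impedance law: a stiffness term pulls toward the
        # planned setpoint, a damping term tracks its velocity, and a
        # gravity-compensation term is added on top.
        return K @ (q_des - q) + D @ (qd_des - qd) + tau_gravity

    def execute_plan(plan, robot, K, D, dt=0.01):
        # Feed the planned motion to the impedance controller
        # sequentially until the planned motion is complete.
        for q_des, qd_des in plan:
            q, qd = robot.read_state()              # hypothetical API
            tau = impedance_policy(q, qd, q_des, qd_des, K, D,
                                   robot.gravity_torque(q))
            robot.apply_torques(tau)                # hypothetical API
            robot.wait(dt)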

Grasping is a key component of the system 300 illustrated in FIG. 3. Grasping is required before performing any other manipulation tasks, e.g., it is necessary to pick up the wrench to tighten the bolt. The grasping system, i.e., the software and hardware configured to control the robot, uses a database (120 in FIG. 1), the “grasp database”, to determine how various objects should be grasped. For every possible object that the robot needs to grasp, the grasp database 120 provides the desired relative poses (or other proxies for this information, such as key points for the robot 104 or manipuland) between the manipuland, i.e., the object to be grasped, and each of the robot links that is to contact the manipuland, for the following stages of grasping: pre-grasp, the configuration prior to the grasp; grasping, the configuration during the grasp; and release, the configuration subsequent to releasing the manipuland.
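
One plausible in-memory organization for such a grasp database is sketched below. The schema is an assumption made for illustration; only the three grasp stages and the per-link relative poses come from the description above:

    from dataclasses import dataclass, field

    @dataclass
    class GraspRecord:
        # Relative pose (4x4 homogeneous transform) of each contacting
        # robot link, expressed in the manipuland frame, per stage.
        pre_grasp: dict  # link name -> pose prior to the grasp
        grasp: dict      # link name -> pose during the grasp
        release: dict    # link name -> pose after releasing

    @dataclass
    class GraspDatabase:
        # object identifier -> list of grasp options for that object
        records: dict = field(default_factory=dict)

        def query(self, object_id):
            # Return every stored grasp option for an object.
            return self.records.get(object_id, [])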

The grasp database 120 informs the reach behavior as to how the robot should move toward the object in order to grasp it. Given a target manipuland as input, the reach behavior queries the grasp database 120 for the pre-grasp configuration. The robot then plans a motion free of contact with the environment, excepting the manipuland, using any one of a number of motion planning algorithms, like the rapidly-exploring random tree (RRT) algorithm. It should be noted that the choice of algorithm will affect only the time it takes to find a solution. Once a contact-free path, represented as a set of points in the robot’s configuration space, has been obtained, polynomial splines are fit to the points, yielding a trajectory, i.e., a time-dependent path. The reach behavior converts the trajectory in the robot’s joint-space into operational or task space.
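
The path-to-trajectory step admits a compact sketch. The version below fits cubic splines with SciPy and uses a crude constant-speed time parameterization; the timing heuristic is an assumption, not the parameterization of any embodiment:

    import numpy as np
    from scipy.interpolate import CubicSpline

    def path_to_trajectory(waypoints, max_joint_speed=1.0):
        # waypoints: (N, DOF) contact-free path in configuration space.
        waypoints = np.asarray(waypoints, dtype=float)
        # Allocate time in proportion to joint-space distance between
        # successive waypoints (assumed heuristic; keeps t increasing).
        seg = np.linalg.norm(np.diff(waypoints, axis=0), axis=1)
        t = np.concatenate(([0.0], np.cumsum(np.maximum(seg, 1e-6)
                                             / max_joint_speed)))
        spline = CubicSpline(t, waypoints, axis=0)
        # Return the time-dependent path q(t) and its derivative qd(t).
        return spline, spline.derivative()

Evaluating the returned splines at a time instant yields the joint position and velocity setpoints that a control policy, such as the impedance law sketched earlier, can track.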

In certain embodiments, it is also possible to use a learning approach to map a model or partial model, i.e., only the parts of the model that are observable, of the object’s geometry and apparent surface properties, e.g., friction, to an appropriate way to grasp it, as opposed to or in addition to using the grasp database 120.

Behaviors can be executed sequentially, one immediately following the other; in parallel; or both. When executed in parallel, the outputs from all behaviors, or motor signals, are summed or combined. A state machine acts to switch between combinations of active behaviors (inactive behaviors output zero motor signals) at regular intervals, e.g., at 100 Hz, as a function of behaviors’ conditions, which the programmer defines on a per-behavior basis.
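
A minimal sketch of this switching-and-summing arrangement follows, assuming each behavior exposes an output() method returning a torque vector; the class and method names are illustrative, not from the original disclosure:

    import numpy as np

    class BehaviorStateMachine:
        # Switches combinations of active behaviors at a fixed rate and
        # sums ("fuses") their motor outputs; inactive behaviors are
        # treated as contributing zero motor signal.
        def __init__(self, behaviors, active_sets, transitions, initial, dof):
            self.behaviors = behaviors      # name -> object with .output()
            self.active_sets = active_sets  # state -> set of active names
            self.transitions = transitions  # state -> [(condition, next)]
            self.state = initial
            self.dof = dof

        def tick(self, world):
            # Call once per control interval, e.g., at 100 Hz.
            for condition, next_state in self.transitions.get(self.state, []):
                if condition(world):        # programmer-defined condition
                    self.state = next_state
                    break
            u = np.zeros(self.dof)
            for name in self.active_sets[self.state]:
                u += self.behaviors[name].output(world)
            return u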

As an example of the entire system in operation, a state machine for picking objects with the robot might consist of reaching and grasping behaviors. The state machine would be initialized to an idle state. After receiving a signal with a target object to pick from an operator, which could be a human or a different software program, the state machine would transition to the reach state, activating the reach behavior. The reach behavior would generate a trajectory toward the object, using the grasp database 120 described above, and then execute it. When the execution of the reach trajectory has completed, the state machine would transition to the pre-grasp state, in which the reach behavior generates and executes a trajectory to the pre-grasp configuration. When that trajectory has completed, the state machine would transition to the grasp state, activating a grasp behavior and deactivating the reach behavior.

The grasp behavior uses a fixed control policy and the grasp database 120 to move the hand or gripper from the pre-grasp configuration to the grasping configuration. When the grasp behavior has completed successfully, indicated by detecting sufficient contact forces at the hand/gripper and absence of slip between the robot and the manipuland, the state machine would transition to the transport state, activating the reach behavior in addition to the grasp behavior that is already activated.

Given a target pose in 3D space, the reach behavior again generates and executes a trajectory to effect this goal, at which point the state machine will have successfully executed the pick action. While this sequence of executed behaviors represents a successful pick, the state machine provides contingent operations for when one of the behaviors fails to effect its goal due to, e.g., errors in estimating the state of the manipuland or environment, or imprecision in controlling the robot. For example, if the object slips from the grasp while the final sequence of the pick is executing, the state machine transitions back to the pre-grasp state, beginning the sequence anew.

FIG. 3 is a diagram of example software modules that can be configured to effect the process described above. These modules can be implemented on, e.g., the processor system 550 illustrated in FIG. 2, in order to create a specialized robot control system that provides accurate control of robots that can now autonomously or semi-autonomously, physically interact with various objects in their environments.

Sensory data flows from the robot 104 to the system identification process 302, which builds and refines models of objects (108 in FIG. 1) in the environment (102 in FIG. 1), and to the state estimation process 304, which estimates the state, i.e., position, orientation, and velocity, of the identified objects 108. A “model” encompasses immutable properties of an object, like mass and geometry. Since the objects are typically stiff, geometries do not generally change, but the systems and methods described herein do not rely upon this assumption.

The outputs of these processes, i.e., object models and states, can then be fed into the other software modules, which include a grasp generator 306 that can determine potential configurations, or pre-grasp poses, for grasping the objects that have been identified in the environment. This process returns multiple grasping configurations, or grasp data options, per identified object.

Since the grasp generator generates many potential grasps among the various identified objects, a mechanism is necessary to determine which object should be picked when the choice of object is arbitrary, as is the case when, e.g., physically sorting a collection of objects. The grasp selector 308 can be configured to choose among the various grasp data options. A quality metric, e.g., which grasp requires the robot to move the least, can be computed for each option, and the grasp, and associated object, with the highest quality is selected. Alternatively, a human operator can select a target object from a user interface 310, and the highest quality grasp associated with that target will be used.
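
Under the least-motion reading of the quality metric, grasp selection reduces to scoring each option by the joint-space distance to its pre-grasp configuration. The sketch below assumes a hypothetical ik_solver that maps a pre-grasp pose to a joint configuration (or None when unreachable), and a hypothetical pre_grasp_pose field on each option:

    import numpy as np

    def select_grasp(grasp_options, current_q, ik_solver):
        # Pick the grasp whose pre-grasp configuration requires the
        # robot to move the least from its current configuration.
        best, best_cost = None, np.inf
        for option in grasp_options:
            q_pre = ik_solver(option.pre_grasp_pose)  # hypothetical field
            if q_pre is None:
                continue  # unreachable option: effectively zero quality
            cost = np.linalg.norm(q_pre - current_q)
            if cost < best_cost:
                best, best_cost = option, cost
        return best

Other quality metrics plug into the same loop by replacing the cost expression.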

Given grasp data, the reach, grasp, and release behaviors 312a, 312b, and 312c interact to perform the pick and place task. The behavior outputs, labeled “u” in the diagram, represent motor forces/torques, and are summed together (“fused”) and sent to the robot. Combinations of these behaviors permit complex behavior to emerge. For example, transporting, as described above, emerges from the interactions between the grasp and reach behaviors: the grasp behavior maintains the grasp on the object while the reach behavior is responsible for moving the robot’s end effector to a pose where the object will be placed.

The system can be applied to any robot for which inertial (dynamics), shape, and appearance models of the robot are available. The system’s model can be built using a combination of CAD/CAE and system identification to determine best-fit parameters through physical experimentation. Dynamics studies the movement of systems of interconnected bodies under the action of external forces. The dynamics of a multi-body system are described by the laws of kinematics and by the application of Newton’s second law (kinetics) or their derivative form, Lagrangian mechanics. The solution of these equations of motion provides a description of the position, the motion, and the acceleration of the individual components of the system, and of the system overall, as a function of time.

The model consists of the following information, at minimum: the object mass; the inertia matrix, i.e., the six independent values of the symmetric 3×3 inertia tensor that predict how an object rotates as a function of torques applied to the object; the center-of-mass location; the “undeformed” geometry, i.e., the shape of the object when it is not subject to any forces from loading; the material stiffness; the dry friction coefficient(s); the visual appearance through, e.g., a bidirectional reflectance distribution function; and, if the object is articulated, the location, type (e.g., universal, prismatic, hinge), and parameters (e.g., directional axis) of any joints.

This information can be gathered from direct measurement, estimation, or both. As one example, a user can create a geometric description of the object manually using 3D modeling or computer-aided engineering software, or automatically using a 3D scanner. The object mass (e.g., from weighing the object), density information (known from material composition), and a geometric model can be input to an existing algorithm, such as the one described in B. Mirtich, “Fast and accurate computation of polyhedral mass properties,” J. Graphics Tools, vol. 1, 1996, which will return the center-of-mass and inertia matrix. As another example, the material stiffness can be estimated using ubiquitous tables, provided in engineering reference books, listing Young’s Modulus for various materials.
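
By way of illustration, open-source polyhedral mass-properties routines implement this style of computation; the sketch below uses the trimesh package and assumes a watertight mesh of uniform density:

    import trimesh

    def mass_properties(mesh_path, measured_mass_kg):
        # Load the geometric model (e.g., from CAD or a 3D scanner).
        mesh = trimesh.load(mesh_path, force='mesh')
        # Choose a uniform density so the model's mass matches the
        # weighed mass, then read off the polyhedral mass properties.
        mesh.density = measured_mass_kg / mesh.volume
        return mesh.center_mass, mesh.moment_inertia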

FIG. 1 is a diagram illustrating an example environment 100 in which the systems and methods described herein can be implemented. As can be seen, the system can use at least one RGB-D (color + depth) camera 105, or a similar sensor using electromagnetic radiation, e.g., LIDAR, aimed into the workspace 102 in which the robot 104 will be operating. The camera can be used, for example, to determine the pose of various objects 108. Alternatively, poses can be determined or estimated using radio/electromagnetic wave triangulation, motion capture, etc.

In certain embodiments, every joint 106 of the robot 104 is instrumented with a position sensor (not shown), such as an optical encoder. Further, force/torque sensors (not shown), such as a 6-axis force/torque sensor, can be used to sense forces acting on the robot’s links. The sensors can be placed inline between two rigid links affixed together. Alternatively, tactile skins over the surface of the robot’s rigid links can be used to precisely localize pressures from contact arising between the robot and objects (or people) in the environment.

The camera(s) 105 and sensors 106 can be wired and/or wirelessly communicatively coupled to a back end server or servers, comprising one or more processors running software as described above and below, which in turn can be wired and/or wirelessly communicatively coupled with one or more processors included in robot 104. The server processors run various programs and algorithms 112 that identify objects 108, within images of the workspace 102, that the system has been trained to identify. For example, a camera image may contain a corrugated box, a wrench, and a sealed can of vegetables, all of which can be identified and added to a model containing the objects in the camera’s sightline and the robot’s vicinity. Server(s) 110 can be local or remote from workspace 102. Alternatively, the one or more programs/algorithms can be included in robot 104 and can be run by the one or more processors included in robot 104.

The programs/algorithms 112 can include deep neural networks that perform bounding box identification from camera 105 (RGB) images to identify and demarcate, with overlaid boxes, every object in a particular image that the system has been trained to manipulate and observe. This software can also encompass software that is specialized at identifying certain objects, like corrugated cardboard boxes, using algorithms like edge detectors, and using multiple camera views (e.g., calibrated stereo cameras) in order to get the 3D position of points in the 2D camera image.

When objects are unique, e.g., a 12 oz can of Maxwell House coffee, the 2D bounding box from a single camera is sufficient to estimate the object’s 3D pose. When objects instead belong to a class, e.g., corrugated cardboard box, such that object sizes can vary, multiple camera views, e.g., from a calibrated stereo camera setup, are needed to establish correspondences between points in the 2D camera images and points in 3D. State-of-the-art techniques for training these neural networks use domain randomization to allow objects to be recognized under various lighting conditions, backgrounds, and even object appearances. Function approximation, e.g., deep neural networks trained on synthetic images, or a combination of function approximation and state estimation algorithms, can be used to estimate objects’ 3D poses, or to estimate the value of a different representation, like keypoints, that uniquely determines the location and orientation of essentially rigid objects from RGB-D data. For example, a Bayesian filter (like a “particle filter”) can fuse the signals from force sensors with the pose estimates output from a neural network in order to track the object’s state, i.e., position and velocity.
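
A schematic of that fusion step is sketched below as one step of a bootstrap particle filter; the motion model and the two likelihood functions are placeholder measurement models, not components disclosed above:

    import numpy as np

    def particle_filter_step(particles, weights, motion_model,
                             nn_pose, pose_likelihood,
                             forces, force_likelihood, rng):
        # particles: (N, state_dim) hypotheses of the object state.
        # Predict: propagate each hypothesis through the dynamics.
        particles = motion_model(particles)
        # Update: weight hypotheses by agreement with the neural-network
        # pose estimate and with the force-sensor signals.
        weights = weights * pose_likelihood(particles, nn_pose) \
                          * force_likelihood(particles, forces)
        weights = weights / weights.sum()
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
            idx = rng.choice(len(weights), size=len(weights), p=weights)
            particles = particles[idx]
            weights = np.full(len(weights), 1.0 / len(weights))
        return particles, weights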

Function approximation, e.g., deep neural networks trained on camera images, can be used to estimate a dynamic, e.g., inertia, friction, etc., and geometric (shape) model of all novel objects that are tracked by the system, e.g., using the bounding boxes. The coffee can example used above might not require this process, because it is reasonable to expect that every coffee can is identical, up to the limits of the accuracy brought to bear on the problem, i.e., the accuracy provided by the sensors and required by the control system. By way of contrasting example, boxes will exhibit different surface friction depending on the material of the box, e.g., corrugated, plastic, etc., and the location on the box. For example, if there is a shipping label placed on part of the box, then this can affect surface friction. Similarly, a neural network can infer the geometry of the obscured part of a box from a single image showing part of the box.

If an object is articulated, a kinematic model of the object can be estimated as well. Examples include doors, cabinets, drawers, ratchets, steering wheels, bike pumps, etc. The ground and any other stationary parts of the environment are modeled as having infinite inertia, making them immobile. Function approximation, e.g., deep neural networks trained on pressure fields, can be used to estimate the 3D poses of objects that the robot is contacting and thereby possibly obscuring from the RGB-D sensor normally used for this purpose.

Kinematic commands (“desireds”) for the robot can be accepted for each object that the robot attempts to manipulate. The desireds can come from behaviors. A behavior can be either a fast-to-compute reactive policy, such as a lookup table that maps, e.g., the joint estimated state of the robot and manipuland to a vector of motor commands, or can include deliberative components, or planners, e.g., a motion planner that determines paths for the robot that do not result in contact with the environment. In the case of a planner, the output will be a time-indexed trajectory that specifies position and derivatives for the robot and any objects that the robot wants to manipulate.

In turn, the planner can use high level specifications, e.g., put the box on the table, to compute the output trajectories. This process is where motion planning comes into play.

By inverting the dynamics model of the robot (from a=F/m to F=ma) and modeling contact interactions as mass-spring-damper systems, the forces necessary to apply to the robot’s actuators can be computed in order to produce forces on contacting objects, and thereby move both them and the robot as commanded.
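
In the simplest single-contact, single-axis case, both halves of that statement fit in a few lines. The following is a scalar sketch; real use applies the same idea over full rigid-body dynamics:

    def contact_force(penetration_depth, penetration_rate, k, b):
        # Mass-spring-damper contact model: a force that resists
        # interpenetration but never pulls the bodies together.
        return max(k * penetration_depth + b * penetration_rate, 0.0)

    def actuator_force(mass, desired_accel, external_force):
        # Inverted dynamics (F = m*a): the actuator force needed to
        # realize a commanded acceleration given known external forces.
        return mass * desired_accel - external_force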

If the force/torque data is available, then the sensed forces on the robot can be compared against the forces predicted by the dynamics model. If, after applying some filtering as necessary, the forces are largely different, the robot can halt its current activity and act to re-sense its environment, i.e., reconcile its understanding of the state of its surroundings with the data it is perceiving. For example, a grasped 1 kg box might slip from the robot’s end effector’s grasp while the robot is picking the object. At the time that the object slips from the robot’s grasp, the robot’s end effector would accelerate upward, since less force would be pulling the end effector downward, while the dynamics model, which assumes the object is still within the robot’s grasp, might predict that the end effector would remain at a constant vertical position. When the disparity between the actual end-effector acceleration and predicted end-effector acceleration becomes greater than the model’s bounds of accuracy, it becomes clear that the estimated model state is direly incorrect. For a picking operation, we expect this mismatch to occur due to a small number of incidents: an object has been inadvertently dropped; multiple objects have been inadvertently grasped, e.g., the robot intended to grab one object but grabbed two; a human entered the workspace and was struck by the robot; or the robot inaccurately sensed the workspace, causing it to inadvertently collide with the environment, i.e., the robot failed to sense an object’s existence or it improperly parameterized a sensed object, e.g., estimating a box was small when it was really large.
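
The comparison itself can be as simple as a low-pass-filtered residual test. In the sketch below, the threshold stands in for the model's bounds of accuracy, and the filter constant is an assumed tuning knob:

    import numpy as np

    class ForceDiscrepancyMonitor:
        # Filters the gap between sensed and model-predicted forces and
        # flags when it exceeds the model's accuracy bounds, signaling
        # that the robot should halt and re-sense its environment.
        def __init__(self, threshold, alpha=0.1):
            self.threshold = threshold  # model accuracy bound
            self.alpha = alpha          # low-pass filter constant
            self.residual = 0.0

        def update(self, sensed_force, predicted_force):
            gap = np.linalg.norm(np.asarray(sensed_force)
                                 - np.asarray(predicted_force))
            self.residual = ((1.0 - self.alpha) * self.residual
                             + self.alpha * gap)
            return self.residual > self.threshold  # True -> halt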

The behaviors in this system, as well as the controllers, the perception system, and the conditions for transitioning between states in the state machine, all use robot- and environment-specific numbers (parameters). For example, controllers use gains to determine how quickly errors should be corrected; stiff (large) gains correct errors quickly at the expense of possible damage to the robot or environment if the error is due to inadvertent contact between the robot and environment. All such open parameters, which are state-dependent, i.e., they generally should change dynamically in response to the conditions of the robot and environment, are optimally computed to maximize the robot’s task performance by solving an optimal control problem. Since the optimal control problem generally requires too much computation to solve, even offline, approximations are computed instead. Approximations include using dynamic programming with discretized state and action spaces, and reinforcement learning algorithms, e.g., the policy gradient algorithm. Our system uses simulations, given the detailed physical models previously described, to perform these optimizations and compute performant parameters offline. Further optimization can be performed online: parameters can be adjusted based on actual task performance, measured using sensory data. Such transfer learning can even use the performance of similar, but not identical, robots on similar, but not identical, tasks in order to adjust parameters.
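
As one coarse stand-in for the offline approximations named above, parameters can be tuned by scoring sampled candidates in simulation. The random-search sketch below assumes a simulate_task function that runs the physical models described earlier and returns a scalar task-performance score:

    import numpy as np

    def tune_parameters(simulate_task, lower, upper, n_samples=1000, seed=0):
        # Offline approximation of the optimal control problem: sample
        # candidate parameter vectors within bounds, score each by
        # simulated task performance, and keep the best one found.
        rng = np.random.default_rng(seed)
        lower, upper = np.asarray(lower), np.asarray(upper)
        best_params, best_score = None, -np.inf
        for _ in range(n_samples):
            params = rng.uniform(lower, upper)
            score = simulate_task(params)  # assumed simulation hook
            if score > best_score:
                best_params, best_score = params, score
        return best_params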

FIG. 2 is a block diagram illustrating an example wired or wireless system 550 that can be used in connection with various embodiments described herein. For example, the system 550 can be used to implement the robot control system described above and can comprise part of the robot 104 or backend servers 110. The system 550 can be a server or any conventional personal computer, or any other processor-enabled device that is capable of wired or wireless data communication. Other computer systems and/or architectures may also be used, as will be clear to those skilled in the art.

The system 550 preferably includes one or more processors, such as processor 560. Additional processors may be provided, such as an auxiliary processor to manage input/output, an auxiliary processor to perform floating point mathematical operations, a special-purpose microprocessor having an architecture suitable for fast execution of signal processing algorithms (e.g., digital signal processor), a slave processor subordinate to the main processing system (e.g., back-end processor), an additional microprocessor or controller for dual or multiple processor systems, or a coprocessor. System 550 can also include a tensor processing unit as well as motion planning processors or systems.

Such auxiliary processors may be discrete processors or may be integrated with the processor 560. Examples of processors which may be used with system 550 include, without limitation, the Pentium® processor, Core i7® processor, and Xeon® processor, all of which are available from Intel Corporation of Santa Clara, California.

The processor 560 is preferably connected to a communication bus 555. The communication bus 555 may include a data channel for facilitating information transfer between storage and other peripheral components of the system 550. The communication bus 555 further may provide a set of signals used for communication with the processor 560, including a data bus, address bus, and control bus (not shown). The communication bus 555 may comprise any standard or non-standard bus architecture such as, for example, bus architectures compliant with industry standard architecture (ISA), extended industry standard architecture (EISA), Micro Channel Architecture (MCA), peripheral component interconnect (PCI) local bus, or standards promulgated by the Institute of Electrical and Electronics Engineers (IEEE) including IEEE 488 general-purpose interface bus (GPIB), IEEE 696/S-100, and the like.

System 550 preferably includes a main memory 565 and may also include a secondary memory 570. The main memory 565 provides storage of instructions and data for programs executing on the processor 560, such as one or more of the functions and/or modules discussed above. It should be understood that programs stored in the memory and executed by processor 560 may be written and/or compiled according to any suitable language, including without limitation C/C++, Java, JavaScript, Perl, Visual Basic, .NET, and the like. The main memory 565 is typically semiconductor-based memory such as dynamic random access memory (DRAM) and/or static random access memory (SRAM). Other semiconductor-based memory types include, for example, synchronous dynamic random access memory (SDRAM), Rambus dynamic random access memory (RDRAM), ferroelectric random access memory (FRAM), and the like, including read only memory (ROM).

The secondary memory 570 may optionally include an internal memory 575 and/or a removable medium 580, for example a floppy disk drive, a magnetic tape drive, a compact disc (CD) drive, a digital versatile disc (DVD) drive, other optical drive, a flash memory drive, etc. The removable medium 580 is read from and/or written to in a well-known manner. Removable storage medium 580 may be, for example, a floppy disk, magnetic tape, CD, DVD, SD card, etc.

The removable storage medium 580 is a non-transitory computer-readable medium having stored thereon computer executable code (i.e., software) and/or data. The computer software or data stored on the removable storage medium 580 is read into the system 550 for execution by the processor 560.

In alternative embodiments, secondary memory 570 may include other similar means for allowing computer programs or other data or instructions to be loaded into the system 550. Such means may include, for example, an external storage medium 595 and an interface 590. Examples of external storage medium 595 may include an external hard disk drive, an external optical drive, or an external magneto-optical drive.

Other examples of secondary memory 570 may include semiconductor-based memory such as programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable read-only memory (EEPROM), or flash memory (block oriented memory similar to EEPROM). Also included are any other removable storage media 580 and communication interface 590, which allow software and data to be transferred from an external medium 595 to the system 550.

System 550 may include a communication interface 590. The communication interface 590 allows software and data to be transferred between system 550 and external devices (e.g., printers), networks, or information sources, such as possibly robot 104, camera 105, or other sensors. For example, computer software or executable code may be transferred to system 550 from a network server via communication interface 590. Examples of communication interface 590 include a built-in network adapter, network interface card (NIC), Personal Computer Memory Card International Association (PCMCIA) network card, card bus network adapter, wireless network adapter, Universal Serial Bus (USB) network adapter, modem, a wireless data card, a communications port, an infrared interface, an IEEE 1394 fire-wire, or any other device capable of interfacing system 550 with a network or another computing device.

Communication interface 590 preferably implements industry-promulgated protocol standards, such as Ethernet IEEE 802 standards, Fiber Channel, digital subscriber line (DSL), asynchronous digital subscriber line (ADSL), frame relay, asynchronous transfer mode (ATM), integrated digital services network (ISDN), personal communications services (PCS), transmission control protocol/Internet protocol (TCP/IP), serial line Internet protocol/point to point protocol (SLIP/PPP), and so on, but may also implement customized or non-standard interface protocols as well.

Software and data transferred via communication interface 590 are generally in the form of electrical communication signals 605. These signals 605 are preferably provided to communication interface 590 via a communication channel 600. In one embodiment, the communication channel 600 may be a wired or wireless network, or any variety of other communication links. Communication channel 600 carries signals 605 and can be implemented using a variety of wired or wireless communication means including wire or cable, fiber optics, conventional phone line, cellular phone link, wireless data communication link, radio frequency (“RF”) link, or infrared link, just to name a few.

Computer executable code (i.e., computer programs or software) is stored in the main memory 565 and/or the secondary memory 570. Computer programs can also be received via communication interface 590 and stored in the main memory 565 and/or the secondary memory 570. Such computer programs, when executed, enable the system 550 to perform the various functions of the present invention as previously described.

In this description, the term “computer readable medium” is used to refer to any non-transitory computer readable storage media used to provide computer executable code (e.g., software and computer programs) to the system 550. Examples of these media include main memory 565, secondary memory 570 (including internal memory 575, removable medium 580, and external storage medium 595), and any peripheral device communicatively coupled with communication interface 590 (including a network information server or other network device). These non-transitory computer readable media are means for providing executable code, programming instructions, and software to the system 550.

In an embodiment that is implemented using software, the software may be stored on a computer readable medium and loaded into the system 550 by way of removable medium 580, I/O interface 585, or communication interface 590. In such an embodiment, the software is loaded into the system 550 in the form of electrical communication signals 605. The software, when executed by the processor 560, preferably causes the processor 560 to perform the inventive features and functions previously described herein.

In an embodiment, I/O interface 585 provides an interface between one or more components of system 550 and one or more input and/or output devices. Example input devices include, without limitation, keyboards, touch screens or other touch-sensitive devices, biometric sensing devices, computer mice, trackballs, pen-based pointing devices, and the like. The input device can also be the camera 105 or other sensors within environment 102, as well as robot 104. Examples of output devices include, without limitation, cathode ray tubes (CRTs), plasma displays, light-emitting diode (LED) displays, liquid crystal displays (LCDs), printers, vacuum fluorescent displays (VFDs), surface-conduction electron-emitter displays (SEDs), field emission displays (FEDs), and the like.

The system 550 also includes optional wireless communication components that facilitate wireless communication over a voice network and a data network. The wireless communication components comprise an antenna system 610, a radio system 615, and a baseband system 620. In the system 550, radio frequency (RF) signals are transmitted and received over the air by the antenna system 610 under the management of the radio system 615.

In one embodiment, the antenna system 610 may comprise one or more antennae and one or more multiplexors (not shown) that perform a switching function to provide the antenna system 610 with transmit and receive signal paths. In the receive path, received RF signals can be coupled from a multiplexor to a low noise amplifier (not shown) that amplifies the received RF signal and sends the amplified signal to the radio system 615.

In alternative embodiments, the radio system 615 may comprise one or more radios that are configured to communicate over various frequencies. In one embodiment, the radio system 615 may combine a demodulator (not shown) and modulator (not shown) in one integrated circuit (IC). The demodulator and modulator can also be separate components. In the incoming path, the demodulator strips away the RF carrier signal, leaving a baseband receive audio signal, which is sent from the radio system 615 to the baseband system 620.

If the received signal contains audio information, then baseband system 620 decodes the signal and converts it to an analog signal. Then the signal is amplified and sent to a speaker. The baseband system 620 also receives analog audio signals from a microphone. These analog audio signals are converted to digital signals and encoded by the baseband system 620. The baseband system 620 also codes the digital signals for transmission and generates a baseband transmit audio signal that is routed to the modulator portion of the radio system 615. The modulator mixes the baseband transmit audio signal with an RF carrier signal, generating an RF transmit signal that is routed to the antenna system and may pass through a power amplifier (not shown). The power amplifier amplifies the RF transmit signal and routes it to the antenna system 610, where the signal is switched to the antenna port for transmission. The baseband system 620 can also be communicatively coupled with the processor 560.

Radio system 615 can, for example, be used to communicate with robot 104, camera 105, as well as other sensors.

The central processing unit 560 has access to data storage areas 565 and 570. The central processing unit 560 is preferably configured to execute instructions (i.e., computer programs or software) that can be stored in the memory 565 or the secondary memory 570. Computer programs can also be received from the baseband system 620 and stored in the data storage area 565 or in secondary memory 570, or executed upon receipt. Such computer programs, when executed, enable the system 550 to perform the various functions of the present invention as previously described. For example, data storage areas 565 may include various software modules (not shown).

Various embodiments may also be implemented primarily in hardware using, for example, components such as application specific integrated circuits (ASICs), or field programmable gate arrays (FPGAs). Implementation of a hardware state machine capable of performing the functions described herein will also be apparent to those skilled in the relevant art. Various embodiments may also be implemented using a combination of both hardware and software.

Furthermore, those of skill in the art will appreciate that the various illustrative logical blocks, modules, circuits, and method steps described in connection with the above described figures and the embodiments disclosed herein can often be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled persons can implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the invention. In addition, the grouping of functions within a module, block, circuit, or step is for ease of description. Specific functions or steps can be moved from one module, block, or circuit to another without departing from the invention.

Moreover, the various illustrative logical blocks, modules, functions, and methods described in connection with the embodiments disclosed herein can be implemented or performed with a general purpose processor, a digital signal processor (DSP), an ASIC, FPGA, or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor can be a microprocessor, but in the alternative, the processor can be any processor, controller, microcontroller, or state machine. A processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

Additionally, the steps of a method or algorithm described in connection with the embodiments disclosed herein can be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium, including a network storage medium. An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor. The processor and the storage medium can also reside in an ASIC.

Any of the software components described herein may take a variety of forms. For example, a component may be a stand-alone software package, or it may be a software package incorporated as a “tool” in a larger software product. It may be downloadable from a network, for example, a website, as a stand-alone product or as an add-in package for installation in an existing software application. It may also be available as a client-server software application, as a web-enabled software application, and/or as a mobile application.

While certain embodiments have been described above, it will be understood that the embodiments described are by way of example only. Accordingly, the systems and methods described herein should not be limited based on the described embodiments. Rather, the systems and methods described herein should only be limited in light of the claims that follow when taken in conjunction with the above description and accompanying drawings.

We claim:
1. A method comprising: receiving a set of sensor data; with the sensor data, determining a set of physical objects in an environment; determining a virtual model of a physical object of the set, wherein the virtual model comprises immutable properties of the physical object; determining a state estimate for the physical object; based on the state estimate and virtual model, determining a set of potential grasp configurations for grasping the physical object; based on the set of potential grasp configurations, determining a reach behavior associated with a collision-free path towards the physical object relative to a remainder of the set of physical objects in the environment; and determining a trajectory for a robot based on the reach behavior.
2. The method of claim 1, wherein the virtual model comprises an object mass and an object geometry.
3. The method of claim 2, wherein the set of physical objects in the environment are determined with an object detector, wherein the object mass is estimated based on an object class associated with the object detector.
4. The method of claim 1, wherein determining the trajectory comprises: fitting a set of splines to points of the collision-free path in a configuration space of the robot.
5. The method of claim 1, further comprising: selecting the physical object as a grasp target from the set of physical objects based on a set of heuristics; and, based on the selection of the physical object as the grasp target, controlling the robot based on the trajectory.
6. The method of claim 5, wherein the set of heuristics comprises: a grasp quality heuristic or a path length optimization.
7. The method of claim 1, wherein the collision-free path is determined based on an environmental collision evaluation based on pre-computed parameter values for simulated robot behaviors.
8. The method of claim 7, wherein the environmental collision evaluation is based on a lookup table.
9. The method of claim 7, wherein the pre-computed parameter values are determined using reinforcement learning for discretized state and action spaces.
10. The method of claim 1, wherein the virtual model comprises a rigid-body kinematic model for the physical object.
11. The method of claim 1, further comprising: determining a set of robot behaviors based on the potential grasp configurations, the set of robot behaviors comprising at least one of: a pre-grasp behavior, a grasp behavior, a transport behavior, or a release behavior.
12. The method of claim 11, wherein the reach behavior defines, based on the pre-grasp poses, how the robot should move toward the object in order to grasp it.
13. The method of claim 1, wherein the reach behavior defines a set of motor forces and torques for joints of the robot.
14. A method comprising: receiving a set of sensor data; with the sensor data, identifying a physical object within a surrounding environment; determining a virtual model and a state estimate for the physical object, the virtual model comprising an object mass and an object geometry; based on the state estimate and virtual model, determining a set of potential grasp configurations for grasping the physical object; and determining a pre-grasp trajectory for a robot to achieve at least one potential grasp configuration of the set, wherein determining the pre-grasp trajectory comprises enforcing a non-collision constraint between the robot and the surrounding environment.
15. The method of claim 14, further comprising: identifying a plurality of physical objects within the surrounding environment based on the sensor data; and determining the surrounding environment for the physical object, wherein the surrounding environment is associated with a virtual model and a state estimate for each physical object of the plurality.
16. The method of claim 14, wherein the virtual model comprises estimated values for a set of immutable object properties, wherein the object geometry and object mass are modeled as immutable object properties.
17. The method of claim 14, wherein determining the pre-grasp trajectory comprises: fitting a set of splines to points of a collision-free path in a configuration space of the robot.
18. The method of claim 14, wherein determining the pre-grasp trajectory comprises an environmental collision evaluation based on pre-computed parameter values for simulated robot behaviors.
19. The method of claim 18, wherein the environmental collision evaluation is based on a lookup table.
20. The method of claim 18, wherein the pre-computed parameter values are determined using reinforcement learning for discretized state and action spaces.